AITopics | Machine Learning

Machine Learning

"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.

Optimization Can Learn Johnson Lindenstrauss Embeddings

Neural Information Processing Systems, May-30-2025, 03:26:55 GMT

Embeddings play a pivotal role across various disciplines, offering compact representations of complex data structures. Randomized methods like Johnson-Lindenstrauss (JL) provide state-of-the-art and essentially unimprovable theoretical guarantees for achieving such representations. These guarantees are worst-case; in particular, neither the analysis nor the algorithm takes into account any potential structural information of the data. The natural question is: must we randomize? Could we instead use an optimization-based approach, working directly with the data?
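For readers unfamiliar with the randomized baseline the abstract refers to, here is a minimal sketch of a Gaussian random projection in the JL style. It is not the paper's optimization-based method; the function names, the data, and the target dimension `k=400` are illustrative assumptions.

```python
import numpy as np

def jl_project(X, k, seed=0):
    """Project the rows of X (n x d) to k dimensions with a Gaussian random matrix.

    Entries are N(0, 1/k), so squared distances are preserved in expectation.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ R

def max_pairwise_distortion(X, Y):
    """Largest relative error in squared pairwise distances between X and Y."""
    n = X.shape[0]
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            orig = np.sum((X[i] - X[j]) ** 2)
            proj = np.sum((Y[i] - Y[j]) ** 2)
            worst = max(worst, abs(proj / orig - 1.0))
    return worst

# Hypothetical data: 20 points in 1000 dimensions, projected down to 400.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 1000))
Y = jl_project(X, k=400)
distortion = max_pairwise_distortion(X, Y)
print(Y.shape, distortion)
```

Note that the projection matrix is drawn without looking at `X` at all, which is exactly the data-oblivious property the abstract questions.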

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Multi-Stage Predict+Optimize for (Mixed Integer) Linear Programs

Neural Information Processing Systems, May-30-2025, 03:26:32 GMT

The recently-proposed framework of Predict+Optimize tackles optimization problems with parameters that are unknown at solving time, in a supervised learning setting. Prior frameworks consider only the scenario where all unknown parameters are (eventually) revealed at the same time. In this work, we propose Multi-Stage Predict+Optimize, a novel extension catering to applications where unknown parameters are instead revealed in sequential stages, with optimization decisions made in between. We further develop three training algorithms for neural networks (NNs) for our framework as proof of concept, all of which can handle mixed integer linear programs. The first baseline algorithm is a natural extension of prior work, training a single NN which makes a single prediction of unknown parameters.

artificial intelligence, baseline, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.14)
North America > United States > California (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Banking & Finance (1.00)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization. Julien Roy

Neural Information Processing Systems, May-30-2025, 03:25:54 GMT

Adversarial Imitation Learning alternates between learning a discriminator, which tells apart expert demonstrations from generated ones, and a generator's policy that produces trajectories able to fool this discriminator. This alternating optimization is known to be delicate in practice since it compounds unstable adversarial training with brittle and sample-inefficient reinforcement learning. We propose to remove the burden of the policy optimization steps by leveraging a novel discriminator formulation. Specifically, our discriminator is explicitly conditioned on two policies: the one from the previous generator's iteration and a learnable policy. When optimized, this discriminator directly learns the optimal generator's policy. Consequently, our discriminator's update solves the generator's optimization problem for free: learning a policy that imitates the expert does not require an additional optimization loop. This formulation effectively halves the implementation and computational burden of Adversarial Imitation Learning algorithms by removing the Reinforcement Learning phase altogether. We show on a variety of tasks that our simpler approach is competitive with prevalent Imitation Learning methods.
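A discriminator conditioned on two policies can be illustrated with a small sketch. The specific ratio form below, D = π_learn / (π_learn + π_prev), applied per transition with a binary cross-entropy loss, is an assumption made for illustration; the abstract does not spell out the formula, and all names and numbers here are hypothetical.

```python
import numpy as np

def structured_discriminator(pi_learn, pi_prev):
    """Assumed form: probability that a sample came from the expert.

    pi_learn / pi_prev are the probabilities that the learnable policy and the
    previous generator policy assign to the observed actions.
    """
    return pi_learn / (pi_learn + pi_prev)

def bce_loss(pi_learn_exp, pi_prev_exp, pi_learn_gen, pi_prev_gen):
    """Binary cross-entropy: expert samples labeled 1, generated samples 0."""
    d_exp = structured_discriminator(pi_learn_exp, pi_prev_exp)
    d_gen = structured_discriminator(pi_learn_gen, pi_prev_gen)
    return float(-(np.mean(np.log(d_exp)) + np.mean(np.log(1.0 - d_gen))))

# If the learnable policy equals the previous one, D = 0.5 everywhere.
half = np.array([0.5, 0.5])
loss_equal = bce_loss(half, half, half, half)

# A learnable policy that favors expert actions and disfavors generated ones
# lowers the loss without any separate policy-optimization step.
loss_better = bce_loss(np.array([0.8, 0.8]), half, np.array([0.2, 0.2]), half)
print(loss_equal, loss_better)
```

Under this assumed form, the only trainable object is the learnable policy inside the discriminator, which is why minimizing the discriminator loss also updates the policy.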

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.15)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

We would like to thank the reviewers for their thorough evaluations and for bringing to our attention some missing

Neural Information Processing Systems, May-30-2025, 03:25:43 GMT

Contrarily, ASAF-1 uses the binary cross entropy loss in Eq. (13) and does not suffer from compounding [...] Can windowed approach be used for GAIL and AIRL? (R1). Reward Acquisition from ASAF (R3). Our observation is that ASAF-1 is always fastest to learn, e.g., 361.2s [...] Note however that reports of performance w.r.t. wall-clock time should always be taken [...] We learn in the softmax policy class of Eq. (2) since it contains the expert's [...] Unfortunately, we are not allowed to put an external link in this rebuttal.

artificial intelligence, asaf-1, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding. Yitong Dong, Yijin Li, Zhaoyang Huang

Neural Information Processing Systems, May-30-2025, 03:25:12 GMT

In this paper, we propose a novel multi-view stereo (MVS) framework that gets rid of the depth range prior. Unlike recent prior-free MVS methods that work in a pair-wise manner, our method simultaneously considers all the source images. Specifically, we introduce a Multi-view Disparity Attention (MDA) module to aggregate long-range context information within and across multi-view images. Considering the asymmetry of the epipolar disparity flow, the key to our method lies in accurately modeling multi-view geometric constraints. We integrate pose embedding to encapsulate information such as multi-view camera poses, providing implicit geometric constraints for multi-view disparity feature fusion dominated by attention. Additionally, we construct corresponding hidden states for each source image due to significant differences in the observation quality of the same pixel in the reference frame across multiple source frames. We explicitly estimate the quality of the current pixel corresponding to sampled points on the epipolar line of the source image and dynamically update hidden states through the uncertainty estimation module. Extensive results on the DTU dataset and Tanks&Temple benchmark demonstrate the effectiveness of our method.

computer vision, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: Europe > Netherlands (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Supplementary Material: Progressive Kernel Based Knowledge Distillation for Adder Neural Networks

Neural Information Processing Systems, May-30-2025, 03:24:30 GMT

Given two input vectors x and f, the result of the convolution operation at a specific point of the image is the dot product of the two vectors. Thus, Eq. (7) in the main paper can be written as: [...] Thus, the transformation in Eq. (7) in the main paper can be expressed as a linear combination of infinitely many kernel functions, which means the output space is mapped to an infinite-dimensional space. Also note that when n → ∞, L also goes to infinity, which means that the input space is mapped to an infinite-dimensional space. In this section, more experimental results of PKKD are presented. We compared the proposed method with other methods, such as ANN+dropout, Snapshot-KD [3], SP-KD [2], Gift-KD [4] and AT [5], on ResNet-20 using the CIFAR-10 dataset, as shown in Tab. 1. Table 1: Comparison with other methods on ResNet-20 using the CIFAR-10 dataset.

artificial intelligence, knowledge distillation, machine learning, (9 more...)

Neural Information Processing Systems

Country: North America > Canada (0.16)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Kernel Based Progressive Distillation for Adder Neural Networks

Neural Information Processing Systems, May-30-2025, 03:24:23 GMT

Adder Neural Networks (ANNs), which contain only additions, offer a new way of developing deep neural networks with low energy consumption. Unfortunately, accuracy drops when all convolution filters are replaced by adder filters.
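To make the difference between the two filter types concrete, here is a minimal sketch comparing a convolution response (a dot product) with an adder response. The adder response is taken to be the negative L1 distance between the input patch and the filter, which is the usual adder-network formulation but is stated here as an assumption; the vectors are made up for illustration.

```python
import numpy as np

def conv_response(x, f):
    """Convolution at one location: dot product of patch x and filter f."""
    return float(np.dot(x, f))

def adder_response(x, f):
    """Adder filter at one location (assumed form): negative L1 distance.

    Only subtractions, absolute values, and additions are needed.
    """
    return float(-np.sum(np.abs(x - f)))

# Hypothetical flattened patch and filter.
x = np.array([1.0, -2.0, 0.5, 3.0])
f = np.array([0.5, -1.0, 0.0, 2.0])

print(conv_response(x, f))   # 8.5
print(adder_response(x, f))  # -3.0
```

The adder response is never positive and peaks at zero when the patch equals the filter, so its output distribution differs from that of a convolution.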

artificial intelligence, machine learning, neural network, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report (0.46)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Supplementary Material for CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset

Neural Information Processing Systems, May-30-2025, 03:09:22 GMT

Figure S4: The image shows an example of CableInspect-AD_cropped. We create the CableInspect-AD_cropped dataset, which contains the images with the background removed, keeping only the central part of the cables. The dataset was generated by extracting a central band of size 224 × 1120 as shown in fig.
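Extracting a central band can be sketched as a simple center crop. This assumes the band is 224 pixels high and 1120 pixels wide and is centered in the frame; the excerpt does not say which dimension is which or how the band is located, so the function name, orientation, and input size are all hypothetical.

```python
import numpy as np

def crop_central_band(image, band_height=224, band_width=1120):
    """Return the centered band_height x band_width region of an H x W (x C) image."""
    h, w = image.shape[:2]
    if h < band_height or w < band_width:
        raise ValueError("image is smaller than the requested band")
    top = (h - band_height) // 2
    left = (w - band_width) // 2
    return image[top:top + band_height, left:left + band_width]

# Hypothetical 1080 x 1920 RGB frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
band = crop_central_band(frame)
print(band.shape)  # (224, 1120, 3)
```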

data mining, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Industry: Law (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)
(2 more...)

CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset. Margaux Luck, Aldo Zaimi

Neural Information Processing Systems, May-30-2025, 03:09:18 GMT

Machine learning models are increasingly being deployed in real-world contexts. However, systematic studies on their transferability to specific and critical applications are underrepresented in the research literature. An important example is visual anomaly detection (VAD) for robotic power line inspection. While existing VAD methods perform well in controlled environments, real-world scenarios present diverse and unexpected anomalies that current datasets fail to capture. To address this gap, we introduce CableInspect-AD, a high-quality, publicly available dataset created and annotated by domain experts from Hydro-Québec, a Canadian public utility.

anomaly, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Oceania (0.28)
North America > Canada > Quebec (0.25)

Industry: Energy > Power Industry > Utilities (0.55)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(3 more...)

Appendix for: Invertible Gaussian Reparameterization

Neural Information Processing Systems, May-30-2025, 03:08:48 GMT

As mentioned in section 3.1, we can use the matrix determinant lemma to efficiently compute the determinant of the Jacobian of the softmax [...] Proof: For k = 1, ..., K − 1, we have: P(H = k) = [...] Note that the involved integrals are one-dimensional and thus can be accurately approximated with quadrature methods. As mentioned in the main manuscript, our VAE experiments closely follow Maddison et al. [4]: we use the same continuous objective and the same evaluation metrics. Using the former KL results in optimizing a continuous objective which is not a log-likelihood lower bound anymore, which is mainly why we followed Maddison et al. [4]. In addition to the reported comparisons in the main manuscript, we include further comparisons in Table 1 reporting the discretized training ELBO instead. These are variance reduction techniques which heavily lean on the GS to improve the variance of the obtained gradients.
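The matrix determinant lemma mentioned above states that det(A + u vᵀ) = (1 + vᵀ A⁻¹ u) det(A) for invertible A. The sketch below checks it numerically for a diagonal A, where the right-hand side costs O(n) instead of the O(n³) of a general determinant. It is a generic illustration, not the paper's exact Jacobian; the function name and the random inputs are assumptions.

```python
import numpy as np

def det_diag_plus_rank_one(d, u, v):
    """det(diag(d) + u v^T) via the matrix determinant lemma.

    Requires every entry of d to be nonzero so that diag(d) is invertible.
    """
    return float(np.prod(d) * (1.0 + np.sum(v * u / d)))

# Hypothetical inputs: a positive diagonal and two random vectors.
rng = np.random.default_rng(0)
d = rng.uniform(0.5, 2.0, size=6)
u = rng.normal(size=6)
v = rng.normal(size=6)

fast = det_diag_plus_rank_one(d, u, v)
slow = float(np.linalg.det(np.diag(d) + np.outer(u, v)))
print(fast, slow)
```

A softmax-type Jacobian has this diagonal-plus-rank-one structure, which is what makes the lemma applicable there.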

approximation, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)
